Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce redundant code in CUDF JNI #10019

Merged
merged 15 commits into from
Jan 25, 2022

Conversation

mythrocks
Copy link
Contributor

@mythrocks mythrocks commented Jan 11, 2022

This commit cleans up some of the redundancy in CUDF JNI native code, and modifies some of the unique_ptr handling:

Improvements

  1. Added utility function to convert from native_jpointerArray<column_view> to vector<column_view>. Removed redundant code in stringConcat functions.
  2. Added utility to convert from unique_ptr<column> to jlong. This eliminates variables from nearly all the JNI functions.
  3. Added utility functions for extracting std::vector from native_jArray classes. This reduces raw loops in TableJni.cpp.
  4. Eliminated unnecessary use of unique_ptr in locations where they are immediately release()d. (E.g. ColumnVectorJni::getNativeColumnView().)
  5. Removed unnecessary std::move() calls.
  6. Removed unnecessary reinterpret_cast (E.g. in copyColumnViewToCV().)
  7. typedef -> using.

Also, remove repeated code, fix unique_ptr usage.
1. Memory leak in hashing JNI code.
2. Removed convoluted/unnecessary uses of std::unique_ptr.
3. Removed unnecessary uses of std::move.
4. Moved repeated std::transform() for column/column_view conversions to utility function.
5. Removed unnecessary variables, shortened jlong casting.

Yet to address memory leak from bitwiseMergeAndSetValidity.
@mythrocks mythrocks requested a review from a team as a code owner January 11, 2022 18:24
@mythrocks mythrocks self-assigned this Jan 11, 2022
@github-actions github-actions bot added the Java Affects Java cuDF API. label Jan 11, 2022
@mythrocks mythrocks marked this pull request as draft January 11, 2022 18:25
@mythrocks mythrocks changed the title Reduce redundant code in CUDF JNI Reduce redundant code, fix leaks in CUDF JNI Jan 11, 2022
@mythrocks mythrocks changed the title Reduce redundant code, fix leaks in CUDF JNI Fix memory leaks, reduce redundant code in CUDF JNI Jan 11, 2022
java/src/main/native/include/jni_utils.hpp Outdated Show resolved Hide resolved
java/src/main/native/include/jni_utils.hpp Show resolved Hide resolved
java/src/main/native/src/ColumnViewJni.cpp Show resolved Hide resolved
java/src/main/native/src/ColumnViewJni.cpp Outdated Show resolved Hide resolved
Plus, review comments:
1. Renamed `to_jlong(unique_ptr)` to `release_as_jlong(unique_ptr)`.
2. Copyright dates.

Signed-off-by: MithunR <[email protected]>
@mythrocks mythrocks added bug Something isn't working code quality non-breaking Non-breaking change labels Jan 11, 2022
@codecov
Copy link

codecov bot commented Jan 12, 2022

Codecov Report

Merging #10019 (5d3582d) into branch-22.04 (e24fa8f) will increase coverage by 0.04%.
The diff coverage is 0.00%.

Impacted file tree graph

@@               Coverage Diff                @@
##           branch-22.04   #10019      +/-   ##
================================================
+ Coverage         10.37%   10.42%   +0.04%     
================================================
  Files               119      119              
  Lines             20149    20609     +460     
================================================
+ Hits               2091     2148      +57     
- Misses            18058    18461     +403     
Impacted Files Coverage Δ
python/cudf/cudf/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/io.py 0.00% <0.00%> (ø)
python/cudf/cudf/_fuzz_testing/orc.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/parquet.py 0.00% <ø> (ø)
python/cudf/cudf/_fuzz_testing/utils.py 0.00% <0.00%> (ø)
python/cudf/cudf/api/types.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/_base_index.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/__init__.py 0.00% <ø> (ø)
python/cudf/cudf/core/column/column.py 0.00% <0.00%> (ø)
python/cudf/cudf/core/column/decimal.py 0.00% <0.00%> (ø)
... and 81 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 52a61b7...5d3582d. Read the comment docs.

@mythrocks mythrocks requested a review from jlowe January 12, 2022 18:38
@mythrocks
Copy link
Contributor Author

Though it could stand to be improved, I think I'll leave convert_table_for_return() intact for this iteration.

Converted native_jbooleanArray into a separate class.
Added utilities for extracting std::vectors, transform_if_else(), etc.

Signed-off-by: MithunR <[email protected]>
Copy link
Contributor

@revans2 revans2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking really good

java/src/main/native/include/jni_utils.hpp Outdated Show resolved Hide resolved
Removed explicit loops in map initialization.
More to_jlong() fixes.

Signed-off-by: MithunR <[email protected]>
@mythrocks mythrocks changed the title Fix memory leaks, reduce redundant code in CUDF JNI Reduce redundant code in CUDF JNI Jan 13, 2022
@mythrocks mythrocks added improvement Improvement / enhancement to an existing function tech debt and removed bug Something isn't working labels Jan 13, 2022
@github-actions github-actions bot removed conda CMake CMake build issue labels Jan 13, 2022
@mythrocks mythrocks marked this pull request as ready for review January 13, 2022 23:23
@mythrocks mythrocks requested a review from a team as a code owner January 13, 2022 23:23
@mythrocks mythrocks requested review from harrism and jrhemstad and removed request for harrism and jrhemstad January 13, 2022 23:23
@github-actions github-actions bot removed the libcudf Affects libcudf (C++/CUDA) code. label Jan 20, 2022
@mythrocks mythrocks requested a review from jlowe January 20, 2022 23:03
@mythrocks
Copy link
Contributor Author

mythrocks commented Jan 20, 2022

I've rebased this on branch-22.04. Further modifications (E.g. to convert_table_for_return()) can be deferred to a follow-up.
@jlowe: might I bother you to have another look?

@mythrocks mythrocks removed the request for review from a team January 21, 2022 18:08
@mythrocks
Copy link
Contributor Author

Rerun tests

@jlowe
Copy link
Contributor

jlowe commented Jan 21, 2022

I just noticed that there are a number of reinterpret_cast<jlong> occurrences still in TableJni.cpp, as well as a number remaining elsewhere in other files, including ones that are of the form reinterpret_cast<jlong>(x.release()). Were these left intentionally?

@mythrocks
Copy link
Contributor Author

mythrocks commented Jan 24, 2022

I just noticed that there are a number of reinterpret_cast occurrences still in TableJni.cpp...

The ones I left alone (e.g. the join code you added most recently) seemed fine to my eye, at the time. I should probably make another pass and confirm. Perhaps reconsider avoiding changes to convert_table_for_return(). The reinterpret_cast<jlong>(x.release()) sounds avoidable.

@mythrocks
Copy link
Contributor Author

there are a number of reinterpret_cast<jlong> occurrences

These occurrences have been removed from ColumnVectorJni.cpp, ColumnViewJni.cpp, and TableJni.cpp.

@mythrocks
Copy link
Contributor Author

Rebased to latest, including #10040. Waiting on CI before merging.

@mythrocks
Copy link
Contributor Author

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 22daaea into rapidsai:branch-22.04 Jan 25, 2022
@mythrocks mythrocks deleted the jni-jpointerarray-fix branch January 25, 2022 21:36
@mythrocks
Copy link
Contributor Author

Thanks for the reviews, @revans2, @jlowe. This is now merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
improvement Improvement / enhancement to an existing function Java Affects Java cuDF API. non-breaking Non-breaking change
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants